43 research outputs found

The Brain on Low Power Architectures - Efficient Simulation of Cortical Slow Waves and Asynchronous States

Author: Ammendola Roberto
Biagioni Andrea
Capuani Fabrizio
Cicero Francesca Lo
Cretaro Paolo
De Bonis Giulia
Lonardo Alessandro
Martinelli Michele
Paolucci Pier Stanislao
Pastorelli Elena
Pontisso Luca
Simula Francesco
Vicini Piero
Publication venue: 'IOS Press'
Publication date: 01/01/2018

Efficient brain simulation is a scientific grand challenge, a parallel/distributed coding challenge and a source of requirements and suggestions for future computing architectures. Indeed, the human brain includes about 10^15 synapses and 10^11 neurons activated at a mean rate of several Hz. Full brain simulation poses Exascale challenges even if simulated at the highest abstraction level. The WaveScalES experiment in the Human Brain Project (HBP) has the goal of matching experimental measures and simulations of slow waves during deep sleep and anesthesia and the transition to other brain states. The focus is the development of dedicated large-scale parallel/distributed simulation technologies. The ExaNeSt project designs an ARM-based, low-power HPC architecture scalable to millions of cores, developing a dedicated scalable interconnect system, and simulations of Slow Wave Activity (SWA) and Asynchronous aWake (AW) states are included among the driving benchmarks. At the junction of the two projects is the INFN proprietary Distributed and Plastic Spiking Neural Networks (DPSNN) simulation engine. DPSNN can be configured to stress either the networking or the computation features available on the execution platforms. The simulation stresses the networking component when the neural net, composed of a relatively low number of neurons, each one projecting thousands of synapses, is distributed over a large number of hardware cores. As the number of neurons per core grows, computation becomes the dominant component for short-range connections. This paper reports preliminary performance results obtained on an ARM-based HPC prototype developed in the framework of the ExaNeSt project. Furthermore, a comparison is given of instantaneous power, total energy consumption, execution time and energetic cost per synaptic event of SWA/AW DPSNN simulations when executed on either ARM- or Intel-based server platforms.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Gaussian and exponential lateral connectivity on distributed spiking neural network simulation

Author: Ammendola Roberto
Biagioni Andrea
Capuani Fabrizio
Cicero Francesca Lo
Cretaro Paolo
De Bonis Giulia
Lonardo Alessandro
Martinelli Michele
Paolucci Pier Stanislao
Pastorelli Elena
Pontisso Luca
Simula Francesco
Vicini Piero
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

We measured the impact of long-range exponentially decaying intra-areal lateral connectivity on the scaling and memory occupation of a distributed spiking neural network simulator compared to that of short-range Gaussian decays. While previous studies adopted short-range connectivity, recent experimental neuroscience studies are pointing out the role of longer-range intra-areal connectivity, with implications for neural simulation platforms. Two-dimensional grids of cortical columns composed of up to 11 M point-like spiking neurons with spike frequency adaptation were connected by up to 30 G synapses using short- and long-range connectivity models. The MPI processes composing the distributed simulator were run on up to 1024 hardware cores, hosted on a 64-node server platform. The hardware platform was a cluster of IBM NX360 M5 16-core compute nodes, each one containing two Intel Xeon Haswell 8-core E5-2630 v3 processors, with a clock of 2.40 GHz, interconnected through an InfiniBand network, equipped with 4x QDR switches.
Comment: 9 pages, 9 figures, added reference to final peer reviewed version on conference paper and DO

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Real-time cortical simulations: energy and interconnect scaling on distributed systems

Author: Ammendola Roberto
Biagioni Andrea
Capone Cristiano
Capuani Fabrizio
Cicero Francesca Lo
Cretaro Paolo
De Bonis Giulia
Lonardo Alessandro
Martinelli Michele
Paolucci Pier Stanislao
Pastorelli Elena
Pontisso Luca
Simula Francesco
Vicini Piero
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

We profile the impact of computation and inter-processor communication on the energy consumption and on the scaling of cortical simulations approaching the real-time regime on distributed computing platforms. Also, the speed and energy consumption of processor architectures typical of standard HPC and embedded platforms are compared. We demonstrate the importance of the design of low-latency interconnect for speed and energy consumption. The cost of cortical simulations is quantified using the Joule per synaptic event metric on both architectures. Reaching efficient real-time performance on large-scale cortical simulations is of increasing relevance both for future bio-inspired artificial intelligence applications and for understanding the cognitive functions of the brain, a scientific quest that will require embedding large-scale simulations into highly complex virtual or real worlds. This work stands at the crossroads between the WaveScalES experiment in the Human Brain Project (HBP), which includes the objective of large-scale thalamo-cortical simulations of brain states and their transitions, and the ExaNeSt and EuroExa projects, which investigate the design of an ARM-based, low-power High Performance Computing (HPC) architecture with a dedicated interconnect scalable to millions of cores; simulations of deep sleep Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by thalamo-cortical models are among their benchmarks.
Comment: 8 pages, 8 figures, 4 tables, submitted after final publication on PDP2019 proceedings, corrected final DOI. arXiv admin note: text overlap with arXiv:1812.04974, arXiv:1804.0344

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

TEXTAROSSA: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale

Author: Agosta Giovanni
Aldinucci Marco
Alvarez Carlos
Ammendola Roberto
Arfat Yasir
Beaumont Olivier
Bernaschi Massimo
Biagioni Andrea
Boccali Tommaso
Bramas Bérenger
Brandolese Carlo
Cantalupo Barbara
Cattaneo Daniele
Celino Massimo
Colonnelli Iacopo
Cretaro Paolo
d'Ambra Pasqua
Danelutto Marco
Esposito Roberto
Eyraud-Dubois Lionel
Filgueras Antonio
Fornaciari William
Frezza Ottorino
Galimberti Andrea
Giacomini Francesco
Goglin Brice
Guermouche Abdou
Iannone Francesco
Kulczewski Michal
Lo Cicero Francesca
Lonardo Alessandro
Martinelli Alberto
Martorell Xavier
Massari Giuseppe
Mittone Gianluca
Montangero Simone
Namyst Raymond
Oleksiak Ariel
Palazzari Paolo
Reghenzani Federico
Saponara Sergio
Simula Francesco
Paolucci Pier Stanislao
Terraneo Federico
Thibault Samuel
Torquati Massimo
Turisini Matteo
Vicini Piero
Vidal Miquel
Zoni Davide
Zummo Giuseppe
Publication venue: HAL CCSD
Publication date: 01/09/2021

To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.

INRIA a CCSD electronic archive server

The Latest Version of SiFAP: Beyond Microsecond Time Scale Photometry of Variable Objects

Author: Ambrosino F.
Bruni I.
Cretaro Paolo
Meddi Franco
Rossi Corinne
Sclavi Silvia
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2016

Technical improvements of the Silicon Fast optical Astronomical Photometer (SiFAP) allow the instrument to integrate photons coming from the target in time windows down to 20 μs. A further hardware improvement has been implemented to tag the Time of Arrival (ToA) of each single photon. In addition, a new commercial GPS unit has replaced the older commercial unit, improving time resolution. The latest version of SiFAP has been calibrated to check photometric sensitivity and linearity through observations of several standard stars. SiFAP has also been successfully tested by observing the HZ Her/Her X-1 binary system and estimating the spin period of the pulsar (Her X-1). Our results were then compared to data available in the literature.

Archivio della ricerca- Università di Roma La Sapienza

EuroEXA Custom Switch: an innovative FPGA-based system for extreme scale computing in Europe

Author: Biagioni Andrea
Cretaro Paolo
Frezza Ottorino
Lo Cicero Francesca
Lonardo Alessandro
Paolucci Pier Stanislao
Pontisso Luca
Simula Francesco
Vicini Piero
Publication venue: 'EDP Sciences'
Publication date: 01/01/2020

EuroEXA is a major European FET research initiative that aims to deliver a proof-of-concept of a next generation Exa-scalable HPC platform. EuroEXA builds on the results of previous projects (ExaNeSt, ExaNoDe and ECOSCALE) to design a medium-scale but scalable, fully working HPC system prototype exploiting state-of-the-art FPGA devices that integrate compute accelerators and a low-latency, high-throughput network. Exascale-class systems are expected to host a very large number of computing nodes, from 10^4 up to 10^5, so that the capability and performance of the interconnect architecture are critical to achieving high computing efficiency at this scale. In this perspective, EuroEXA enhances the ExaNet architecture, inherited from the ExaNeSt project, and introduces a multi-tier, hybrid topology network built on top of an FPGA-integrated Custom Switch that provides high throughput and low inter-node traffic latency for the different layers of the network hierarchy. Deployment of a few testbeds is planned, with incremental complexity and equipped with a complete software stack and runtime environment, to support the integration and test of the network design and to allow for evaluation of system performance and scalability through benchmarks based on real HPC applications. Design and integration activities are ongoing and the first small-scale prototype (50 nodes) is expected to be completed in fall 2020 followed, one year later, by the deployment of the larger prototype (250/500 nodes).

Directory of Open Access Journals

EuroEXA Custom Switch: an innovative FPGA-based system for extreme scale computing in Europe

Author: Alessandro Lonardo
Andrea Biagioni
Francesca Lo Cicero
Francesco Simula
Luca Pontisso
Ottorino Frezza
Paolo Cretaro
Pier Stanislao Paolucci
Piero Vicini
Publication venue: 'EDP Sciences'
Publication date: 16/11/2020

EDP Sciences OAI-PMH repository (1.2.0)

L0TP+: the Upgrade of the NA62 Level-0 Trigger Processor

Author: Ammendola Roberto
Biagioni Andrea
Ciardiello Andrea
Cretaro Paolo
Frezza Ottorino
Lamanna Gianluca
Lo Cicero Francesca
Lonardo Alessandro
Piandani Roberto
Pontisso Luca
Salamon Andrea
Simula Francesco
Soldi Dario
Sozzi Marco
Vicini Piero
Publication venue: 'EDP Sciences'
Publication date: 01/01/2020

The L0TP+ initiative is aimed at the upgrade of the FPGA-based Level-0 Trigger Processor (L0TP) of the NA62 experiment at CERN for the post-LS2 data taking, which is expected to happen at 100% of design beam intensity, corresponding to about 3.3 × 10^12 protons per pulse on the beryllium target used to produce the kaon beam. Although tests performed at the end of 2018 showed substantial robustness of the L0TP system even at full beam intensity, there are several reasons to motivate such an upgrade: i) avoid FPGA platform obsolescence, ii) make room for improvements in the firmware design leveraging a more capable FPGA device, iii) add new functionalities, iv) support the fourfold beam intensity increase foreseen in future experiment upgrades. We singled out the Xilinx Virtex UltraScale+ VCU118 development board as the ideal platform for the project. Seamless integration of L0TP+ into the current NA62 TDAQ system and exact matching of L0TP functionalities represent the main requirements and focus of the project; nevertheless, the final design will include additional features, such as a PCIe RDMA engine to enable processing on CPU and GPU accelerators, and the partial reconfiguration of trigger firmware starting from a high-level language description (C/C++). The latter capability is enabled by modern High Level Synthesis (HLS) tools, but to what extent this methodology can be applied to perform complex tasks in the L0 trigger, with its stringent latency requirements and the limits imposed by single FPGA resources, is currently being investigated. As a test case for this scenario we considered the online reconstruction of the RICH detector rings on an HLS-generated module, using a dedicated primitives data stream with PM hit IDs. In addition, the chosen platform supports the wide I/O capabilities of the Virtex UltraScale+ FPGA, allowing for straightforward integration of primitive streams from additional sub-detectors in order to improve the performance of the trigger.

Directory of Open Access Journals

L0TP+: the Upgrade of the NA62 Level-0 Trigger Processor

Author: Alessandro Lonardo
Andrea Biagioni
Andrea Ciardiello
Andrea Salamon
Dario Soldi
Francesca Lo Cicero
Francesco Simula
Gianluca Lamanna
Luca Pontisso
Marco Sozzi
Ottorino Frezza
Paolo Cretaro
Piero Vicini
Roberto Ammendola
Roberto Piandani
Publication venue: 'EDP Sciences'
Publication date: 17/11/2020

EDP Sciences OAI-PMH repository (1.2.0)

The Next Generation of Exascale-Class Systems: The ExaNeSt Project

Author: Ammendola Roberto
Biagioni Andrea
Chrysos Nikolaos
Cretaro Paolo
Frezza Ottorino
Goodacre John
Katevenis Manolis
Lo Cicero Francesca
Lonardo Alessandro
Luján Mikel
Martinelli Michele
Navaridas Javier
Paolucci Pier Stanislao
Pascual Jose A.
Pastorelli Elena
Simula Francesco
Taffoni Giuliano
Vicini Piero
Publication venue: IEEE
Publication date: 01/01/2017

The ExaNeSt project started in December 2015 and is funded by the EU H2020 research framework (call H2020-FETHPC-2014, n. 671553) to study the adoption of low-cost, Linux-based, power-efficient 64-bit ARM processor clusters for Exascale-class systems. The ExaNeSt consortium pools partners with industrial and academic research expertise in storage, interconnects and applications that share a vision of a European Exascale-class supercomputer. Their goal is to design and implement a physical rack prototype together with its cooling system, the storage non-volatile memory (NVM) architecture and a low-latency interconnect able to test different options for interconnection and storage. Furthermore, the consortium is to provide real HPC applications to validate the system. Herein we provide a status report of the project's initial developments.
To appear in: the Proceedings of the Euromicro Conference on Digital System Design (DSD 2017), Vienna, Austria, 30 August - 1 September, 201

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

The University of Manchester - Institutional Repository

Archivio della ricerca- Università di Roma La Sapienza